419 research outputs found
A second order cone formulation of continuous CTA model
The final publication is available at link.springer.com.
In this paper we consider a minimum distance Controlled Tabular Adjustment (CTA) model for statistical disclosure limitation (control) of tabular data. The goal of the CTA model is to find the closest safe table to some original tabular data set that contains sensitive information. Closeness is usually measured using the l1 or l2 norm, each of which has its advantages and disadvantages. Recently, in [4], a regularization of the l1-CTA using the Pseudo-Huber function was introduced in an attempt to combine positive characteristics of both l1-CTA and l2-CTA. All three models can be solved using appropriate versions of Interior-Point Methods (IPM). It is known that IPM in general work better on well-structured problems such as conic optimization problems; thus, reformulation of these CTA models as conic optimization problems may be advantageous. We present reformulations of Pseudo-Huber-CTA and l1-CTA as Second-Order Cone (SOC) optimization problems and test the validity of the approach on a small example of a two-dimensional tabular data set.
Peer Reviewed. Postprint (author's final draft)
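To illustrate why a Pseudo-Huber term fits a second-order cone model, here is a minimal sketch. It assumes one common form of the function, sqrt(delta^2 + x^2) - delta; the abstract does not state the exact scaling used in [4], and the function names below are hypothetical. The point is that the epigraph condition t >= sqrt(delta^2 + x^2) is the statement that (t, (delta, x)) lies in a second-order cone.

```python
import math


def pseudo_huber(x, delta):
    """Assumed form of the Pseudo-Huber function: sqrt(delta^2 + x^2) - delta.

    It behaves like x^2 / (2 * delta) near zero (smooth, l2-like)
    and like |x| - delta for large |x| (l1-like).
    """
    return math.hypot(delta, x) - delta


def in_second_order_cone(t, u):
    """Return True if (t, u) satisfies t >= ||u||_2."""
    return t >= math.sqrt(sum(v * v for v in u))


def epigraph_point_is_feasible(x, delta, t):
    """Check the SOC form of t >= sqrt(delta^2 + x^2).

    In a conic model, t would be a decision variable and the objective
    would contain the term t - delta for each adjusted cell.
    """
    return in_second_order_cone(t, (delta, x))
```

For example, `epigraph_point_is_feasible(3.0, 4.0, 5.1)` holds because sqrt(4^2 + 3^2) = 5, while the same call with `t = 4.9` does not.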
Privacy-Preserving Ridge Regression with only Linearly-Homomorphic Encryption
Linear regression with 2-norm regularization (i.e., ridge regression) is an important statistical technique that models the relationship between some explanatory values and an outcome value using a linear function. In many applications (e.g., predictive modelling in personalised health care), these values represent sensitive data owned by several different parties who are unwilling to share them. In this setting, training a linear regression model becomes challenging and needs specific cryptographic solutions. This problem was elegantly addressed by Nikolaenko et al. in S&P (Oakland) 2013. They suggested a two-server system that uses linearly-homomorphic encryption (LHE) and Yao’s two-party protocol (garbled circuits). In this work, we propose a novel system that can train a ridge linear regression model using only LHE (i.e., without using Yao’s protocol). This greatly improves the overall performance (both in computation and communication) as Yao’s protocol was the main bottleneck in the previous solution. The efficiency of the proposed system is validated both on synthetically-generated and real-world datasets.
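The reason additive (linearly-homomorphic) operations are a natural fit for this setting can be sketched in plaintext. Ridge regression solves (X^T X + lambda I) w = X^T y, and both X^T X and X^T y are sums of per-record terms, so for data split by records each party can compute its own contribution locally. The sketch below is only an illustration under that assumption: it performs the additions in the clear where a real system would add ciphertexts, it is not the protocol of the paper, and all function names are hypothetical.

```python
def local_shares(rows, targets):
    """One party's contribution: A = sum of x x^T and b = sum of y x."""
    d = len(rows[0])
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    for x, y in zip(rows, targets):
        for i in range(d):
            b[i] += y * x[i]
            for j in range(d):
                A[i][j] += x[i] * x[j]
    return A, b


def add_shares(shares):
    """Entry-wise sum of all contributions (stands in for homomorphic addition)."""
    d = len(shares[0][1])
    A = [[sum(s[0][i][j] for s in shares) for j in range(d)] for i in range(d)]
    b = [sum(s[1][i] for s in shares) for i in range(d)]
    return A, b


def solve(M, v):
    """Solve M w = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * c for a, c in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ridge_fit(shares, lam):
    """Ridge weights from aggregated shares: (A + lam * I) w = b."""
    A, b = add_shares(shares)
    for i in range(len(b)):
        A[i][i] += lam
    return solve(A, b)


# Two hypothetical parties holding records of the same model y = 2*x1 + 3*x2.
party_1 = local_shares([[1.0, 0.0], [0.0, 1.0]], [2.0, 3.0])
party_2 = local_shares([[1.0, 1.0]], [5.0])
weights = ridge_fit([party_1, party_2], lam=0.0)
```

With `lam=0.0` the fit recovers the generating weights; a positive `lam` shrinks them toward zero.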
Structural identifiability of dynamic systems biology models
22 pages, 5 figures, 2 tables. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
A powerful way of gaining insight into biological systems is by creating a nonlinear differential equation model, which usually contains many unknown parameters. Such a model is called structurally identifiable if it is possible to determine the values of its parameters from measurements of the model outputs. Structural identifiability is a prerequisite for parameter estimation, and should be assessed before exploiting a model. However, this analysis is seldom performed due to the high computational cost involved in the necessary symbolic calculations, which quickly becomes prohibitive as the problem size increases. In this paper we show how to analyse the structural identifiability of a very general class of nonlinear models by extending methods originally developed for studying observability. We present results about models whose identifiability had not been previously determined, report unidentifiabilities that had not been found before, and show how to modify those unidentifiable models to make them identifiable. This method helps prevent problems caused by lack of identifiability analysis, which can compromise the success of tasks such as experiment design, parameter estimation, and model-based optimization. The procedure is called STRIKE-GOLDD (STRuctural Identifiability taKen as Extended-Generalized Observability with Lie Derivatives and Decomposition), and it is implemented in a MATLAB toolbox which is available as open source software.
The broad applicability of this approach facilitates the analysis of the increasingly complex models used in systems biology and other areas.
AFV acknowledges funding from the Galician government (Xunta de Galiza, Consellería de Cultura, Educación e Ordenación Universitaria http://www.edu.xunta.es/portal/taxonomy/term/206) through the I2C postdoctoral program, fellowship ED481B2014/133-0. AB and AFV were partially supported by grant DPI2013-47100-C2-2-P from the Spanish Ministry of Economy and Competitiveness (MINECO). AFV acknowledges additional funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 686282 (CanPathPro). AP was partially supported through EPSRC projects EP/M002454/1 and EP/J012041/1.
Peer reviewed
Secure and scalable deduplication of horizontally partitioned health data for privacy-preserving distributed statistical computation
Background
Techniques have been developed to compute statistics on distributed datasets without revealing private information except the statistical results. However, duplicate records in a distributed dataset may lead to incorrect statistical results. Therefore, to increase the accuracy of the statistical analysis of a distributed dataset, secure deduplication is an important preprocessing step.
Methods
We designed a secure protocol for the deduplication of horizontally partitioned datasets with deterministic record linkage algorithms. We provided a formal security analysis of the protocol in the presence of semi-honest adversaries. The protocol was implemented and deployed across three microbiology laboratories located in Norway, and we ran experiments on the datasets in which the number of records for each laboratory varied. Experiments were also performed on simulated microbiology datasets and data custodians connected through a local area network.
Results
The security analysis demonstrated that the protocol protects the privacy of individuals and data custodians under a semi-honest adversarial model. More precisely, the protocol remains secure with the collusion of up to N − 2 corrupt data custodians. The total runtime for the protocol scales linearly with the addition of data custodians and records. One million simulated records distributed across 20 data custodians were deduplicated within 45 s. The experimental results showed that the protocol is more efficient and scalable than previous protocols for the same problem.
Conclusions
The proposed deduplication protocol is efficient and scalable for practical uses while protecting the privacy of patients and data custodians.
On a smoothed penalty-based algorithm for global optimization
This paper presents a coercive smoothed penalty framework for nonsmooth and nonconvex constrained global optimization problems. The properties of the smoothed penalty function are derived. Convergence to an ε-global minimizer is proved. At each iteration k, the framework requires the ε(k)-global minimizer of a subproblem, where ε(k)→ε. We show that the subproblem may be solved by well-known stochastic metaheuristics, as well as by the artificial fish swarm (AFS) algorithm. In the limit, convergence of the AFS algorithm to an ε(k)-global minimum of the real-valued smoothed penalty function is guaranteed with probability one, using the limiting behavior of Markov chains. In this context, we show that the transition probability of the Markov chain produced by the AFS algorithm, when generating a population where the best fitness is in the ε(k)-neighborhood of the global minimum, is one when this property holds in the current population, and is strictly bounded away from zero when the property does not hold. Preliminary numerical experiments show that the presented penalty algorithm based on the coercive smoothed penalty gives very competitive results when compared with other penalty-based methods.
The authors would like to thank two anonymous referees for their valuable comments and suggestions to improve the paper. This work has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT - Fundação para a Ciência e Tecnologia within the projects UID/CEC/00319/2013 and UID/MAT/00013/2013.
Search for a Technicolor omega_T Particle in Events with a Photon and a b-quark Jet at CDF
If the Technicolor omega_T particle exists, a likely decay mode is omega_T ->
gamma pi_T, followed by pi_T -> bb-bar, yielding the signature gamma bb-bar. We
have searched 85 pb^-1 of data collected by the CDF experiment at the Fermilab
Tevatron for events with a photon and two jets, where one of the jets must
contain a secondary vertex implying the presence of a b quark. We find no
excess of events above standard model expectations. We express the result as an
exclusion region in the M_omega_T - M_pi_T mass plane.
Comment: 14 pages, 2 figures. Available from the CDF server (PS with figs):
http://www-cdf.fnal.gov/physics/pub98/cdf4674_omega_t_prl_4.ps
FERMILAB-PUB-98/321-
Measurement of the B0 anti-B0 oscillation frequency using l- D*+ pairs and lepton flavor tags
The oscillation frequency Delta-md of B0 anti-B0 mixing is measured using the
partially reconstructed semileptonic decay anti-B0 -> l- nubar D*+ X. The data
sample was collected with the CDF detector at the Fermilab Tevatron collider
during 1992 - 1995 by triggering on the existence of two lepton candidates in
an event, and corresponds to about 110 pb-1 of pbar p collisions at sqrt(s) =
1.8 TeV. We estimate the proper decay time of the anti-B0 meson from the
measured decay length and reconstructed momentum of the l- D*+ system. The
charge of the lepton in the final state identifies the flavor of the anti-B0
meson at its decay. The second lepton in the event is used to infer the flavor
of the anti-B0 meson at production. We measure the oscillation frequency to be
Delta-md = 0.516 +/- 0.099 +0.029 -0.035 ps-1, where the first uncertainty is
statistical and the second is systematic.
Comment: 30 pages, 7 figures. Submitted to Physical Review
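As a small worked illustration of what the quoted frequency implies, the sketch below uses the textbook mixing formula, in which the fraction of mesons that have changed flavor by proper time t is (1 - cos(Delta-md t)) / 2, neglecting any lifetime difference and CP violation. Combining the statistical and systematic uncertainties in quadrature is a common convention assumed here for illustration; it is not stated in the abstract. All names are hypothetical.

```python
import math

# Values quoted in the abstract, in ps^-1.
DELTA_MD = 0.516
STAT = 0.099
SYST_UP, SYST_DOWN = 0.029, 0.035


def mixed_fraction(t_ps, delta_m=DELTA_MD):
    """Fraction of B0 mesons that have oscillated at proper time t (in ps)."""
    return 0.5 * (1.0 - math.cos(delta_m * t_ps))


def total_uncertainty(stat, syst):
    """Quadrature sum of two uncertainties (assumed to be uncorrelated)."""
    return math.hypot(stat, syst)


# Time for one full oscillation cycle at the central value, in ps.
oscillation_period_ps = 2.0 * math.pi / DELTA_MD

# Asymmetric total uncertainties under the quadrature assumption.
total_up = total_uncertainty(STAT, SYST_UP)
total_down = total_uncertainty(STAT, SYST_DOWN)
```

At the central value, the mixed fraction first reaches one at t = pi / Delta-md, half of the oscillation period.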
Search for New Particles Decaying to top-antitop in proton-antiproton collisions at sqrt(s) = 1.8 TeV
We use 106 pb^-1 of data collected with the Collider Detector at Fermilab to
search for narrow-width, vector particles decaying to a top and an anti-top
quark. Model independent upper limits on the cross section for narrow, vector
resonances decaying to t t-bar are presented. At the 95% confidence level, we
exclude the existence of a leptophobic Z' boson in a model of
topcolor-assisted technicolor with mass M_Z' < 480 GeV/c^2 for natural
width Gamma = 0.012 M_Z', and M_Z' < 780 GeV/c^2 for Gamma =
0.04 M_Z'.
Comment: The CDF Collaboration, submitted to PRL 25-Feb-200
Double Diffraction Dissociation at the Fermilab Tevatron Collider
We present results from a measurement of double diffraction dissociation in
collisions at the Fermilab Tevatron collider. The production cross
section for events with a central pseudorapidity gap of width
(overlapping ) is found to be [] at [630]
GeV. Our results are compared with previous measurements and with predictions
based on Regge theory and factorization.
Comment: 10 pages, 4 figures, using RevTeX. Submitted to Physical Review
Letter
A Measurement of the Differential Dijet Mass Cross Section in p-pbar Collisions at sqrt{s}=1.8 TeV
We present a measurement of the cross section for production of two or more
jets as a function of dijet mass, based on an integrated luminosity of 86 pb^-1
collected with the Collider Detector at Fermilab. Our dijet mass spectrum is
described within errors by next-to-leading order QCD predictions using CTEQ4HJ
parton distributions, and is in good agreement with a similar measurement from
the D0 experiment.
Comment: 18 pages including 2 figures and 3 tables. Submitted to Phys. Rev. D
Rapid Communication